Monolingual Experiments with Far-East Languages in NTCIR-6
Authors
Abstract
This paper describes our third participation in an evaluation campaign involving the Chinese, Japanese and Korean languages (NTCIR-6). Our participation is motivated by three objectives: 1) study the retrieval performance of various probabilistic and language models for these languages; 2) compare the relative retrieval effectiveness of a combined "unigram & bigram" indexing scheme combined with an automatic word-segmenting approach for the Chinese and Japanese languages; and 3) evaluate the relative performance of the various data fusion strategies used to combine separate result lists in order to enhance retrieval effectiveness.
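As a rough illustration of what a combined "unigram & bigram" indexing scheme can look like, the sketch below emits every character as a unigram and every pair of adjacent characters as an overlapping bigram. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `unigram_bigram_terms` is hypothetical, whitespace is simply dropped, and no script-specific handling (for example, treating kanji and kana differently in Japanese) is attempted.

```python
def unigram_bigram_terms(text):
    """Return character unigrams followed by overlapping character bigrams.

    Hypothetical helper for illustration only; whitespace is discarded
    and every remaining character is treated alike.
    """
    chars = [c for c in text if not c.isspace()]
    unigrams = list(chars)
    # Pair each character with its right-hand neighbour.
    bigrams = [a + b for a, b in zip(chars, chars[1:])]
    return unigrams + bigrams


terms = unigram_bigram_terms("信息检索")
print(terms)
# ['信', '息', '检', '索', '信息', '息检', '检索']
```

Indexing both term types lets a query match on single characters as well as on two-character sequences, without requiring a word segmenter.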
Similar resources
MIRACLE Retrieval Experiments with East Asian Languages
This paper describes the participation of MIRACLE in the NTCIR 2005 CLIR task. Although our group has a strong background and long expertise in Computational Linguistics and Information Retrieval applied to European languages using the Latin and Cyrillic alphabets, this was our first attempt at East Asian languages. Our main goal was to study the particularities and distinctive characteristics of J...
Experiments in the Retrieval of Unsegmented Japanese Text at the NTCIR-2 Workshop
Our work with the Hopkins Automated Information Retriever for Combing Unstructured Text (HAIRCUT) system has made use of overlapping character n-grams in the indexing and retrieval of text. In previous experiments with Western European languages we have shown that longer length n-grams (e.g., n=6) are capable of providing an effective form of alinguistic term normalization. We have wanted to in...
NTCIR-6 CLIR-J-J Experiments at Yahoo! Japan
This paper describes NTCIR-6 experiments on the CLIR-J-J task, i.e. the Japanese monolingual retrieval subtask, at the Yahoo group, focusing on parameter optimization in information retrieval (IR). Unlike regression approaches, we optimized parameters completely independently of retrieval models so that the optimized parameter set can illustrate the characteristics of the target test collections...
NTCIR-6 Monolingual Chinese and English-Chinese Cross Language Retrieval Experiments using PIRCS
In NTCIR-6, our Stage-1 results, which consist of using old queries to retrieve from a different old collection, were not official because of late submission. Stage-2 submissions, which consist of repeating previous experiments, were on time. These NTCIR-6 experiments were conducted as new without referring to any previous knowledge about the runs. Comparisons with old results however were less fa...
NTCIR-2 ECIR Experiments at Maryland: Comparing Structured Queries and Balanced Translation
Pirkola’s structured queries have been shown to perform well for word-based cross-language information retrieval in European languages, but in monolingual Chinese retrieval experiments it is often found that character bigrams perform as well as, and sometimes better than, automatically segmented words. During the Mandarin-English Information (MEI) project at the Johns Hopkins Summer 2000 Worksh...
Journal title:
Volume, issue
Pages -
Publication date: 2007